Introduction to antigenic cartography

This page contains the code to make antigenic maps and antibody landscapes. You can open the R Markdown file in RStudio to have an interactive step-by-step document and chose the “Visual” display instead of the “Source” display if you prefer it.

The data used here was published in Roessler et al., Nat Comm (2022), publicly available in the paper’s GitHub repository and in this GitHub repo. If you use the data, please cite the original publication. This data contains antibody neutralization titers. However, maps can also be made from binding data or hemagglutinin inhibition data.

If you use materials from this page, please use this repository’s DOI as reference and cite the Racmacs and/or ablandscapes package.

Useful resources

Scientific background on antigenic cartography can be found in Smith et al., Science (2004) and on antibody landscapes in Fonville et al., Science (2014).

A detailed introduction to the R package for making antigenic maps is given on the Racmacs reference page.

To run the Racmacs package and construct antigenic maps and antibody landscapes, we use R: R Core Team (2022). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/.

For examples of map diagnostic steps you can look at the Supplementary Material in Roessler et al., Nat Comm (2022) and Roessler et al., Nat Comm (2023).

Antigenic maps were constructed using the Racmacs package, Version 1.1.35: Wilks S (2022). Racmacs: R Antigenic Cartography Macros. https://acorg.github.io/Racmacs, https://github.com/acorg/Racmacs.

Antibody landscapes were constructed using the ablandscapes package, Version 1.1.0: Wilks S (2021). ablandscapes: Making Antibody landscapes Using R. R package version 1.1.0, https://github.com/acorg/ablandscapes.

We recommend calculating geometric mean titers and fold changes using the titertools package to deal with censored data: Wilks SH (2022). titertools: A statistical toolkit for the annalysis of censored titration data. R package version 0.0.0.9001.

Theoretical background

If the slideshow does not render for you, try opening it in another browser. It should open in Chrome. Alternatively, you can open the pdf ./Introduction_to_antigenic_cartography.pdf. This slideshow gives a brief overview of what is demonstrated with real world data in the remaining document.

Making an antigenic map

# these are the packages used in the notebook and should be installed for it to run smoothly
library(tidyverse)
library(Racmacs)
library(ablandscapes)
library(Racmacs)
library(dplyr)
library(r3js)
library(knitr)
library(kableExtra)
library(patchwork)
#install titertools from GitHub: https://github.com/shwilks/titertools.git if you want to estimate <LOD values for GMT calculation
# library(titertools) 

Antigenic maps should be created from single exposure sera or baseline vaccination sera. With just this type of sera (e.g. only 2 dose vaccinated), the map will be poorly triangulated, meaning the positional resolution of variants will be very low. Sera will not be positioned accurately, or at all if titers against the majority of variants are below the limit of detection (<LOD). Homologous sera from the different variants are needed to add resolution. The map by Roessler et al. (Nat Comm 2022) contains many different human first infection sera.

Importantly, antigenic maps always reflect the data that was used to construct them. So two maps with different types of input sera and variants may look different. However, if enough sera and variants are present in the input data to triangulate the items in euclidean space, maps from different source data should be consistent. Hence maps that look different might be an accurate representation of the input data, but not an accurate representation of antigenic space due to lack of diversity in the input (demonstrated below). To assess the suitability of input data for antigenic maps, map diagnostic steps are important.

Read in data

Set the working directory to the root directory of your project for convenience.

Antigenic maps deal with titers below the limit of detection (<LOD) by allowing the euclidean map distance to be larger than or equal to the titer target distance to the LOD. This utilizes the information that is contained in this measurement: We know the distance is at least as much as the maximum titer to LOD distance. Here, the LOD was 16.

The input for making a map is a table of measured titers. The rows correspond to the titrated antigens, the columns to the measured sera. Here, I will extract a table from an existing map and demonstrate how to create a map from this table.

working_dir <- "~/Documents/smith/labbook/antigenic_cartography_introduction/"

# set working directory
setwd(working_dir)


# read in data for multiple exposure landscapes
multi_exposure_data_b <- read.csv("data/titer_data/titer_data_long.csv") %>%
  mutate(sr_group = gsub("\\/alpha\\+E484K", "", sr_group)) 

# go to Additional section to find out more
multi_exposure_data <- multi_exposure_data_b %>%
  filter(ag_name != "P.1.1")

# Racmacs function to read in antigenic map of .ace format
og_map <- read.acmap("data/maps/map-OmicronI+II+III-thresholded-single_exposure-P1m1.ace")

# extract Titer Table from map
titer_data_og <- titerTable(removeAntigens(og_map, "P.1.1"))

titer_data <- apply(titer_data_og, 2, function(x){
  temp <- as.character(round(as.numeric(x)))
  temp[is.na(temp)] <- "<16"
  return(temp)
})

rownames(titer_data) <- rownames(titer_data_og)

# shorten colnames of samples
colnames(titer_data) <- sapply(colnames(titer_data), function(x){
  paste(str_split(x, "_")[[1]][1:2], collapse = "_")
})
colnames(titer_data) <- gsub("\\/alpha\\+E484K", "", colnames(titer_data))

# read in colors for map
sr_group_colors <- read.csv("data/metadata/sr_group_colors.csv", sep = ";")
mapColors <- read.csv(file = './data/metadata/map-colors.csv', row.names = 'Antigen', header = TRUE)
mapColors[sr_group_colors$SerumGroup, "Color"] <- sr_group_colors$Color 


# show titer table
kable(titer_data, "html", align = "r") %>%
  kable_styling("striped", full_width = F)
G21_delta conv. G22_delta conv. G23_delta conv. G24_delta conv. G25_delta conv. G26_delta conv. G27_delta conv. F620_alpha conv. F628_alpha conv. F633_alpha conv. F635_alpha conv. F647_alpha conv. F650_alpha conv. F658_alpha conv. F661_alpha conv. F662_alpha conv. F667_alpha conv. C701_beta conv. C709_beta conv. C711_beta conv. C715_beta conv. C770_beta conv. C850_beta conv. C859_beta conv. C860_beta conv. G30_mRNA1273/mRNA1273 G33_mRNA1273/mRNA1273 G34_mRNA1273/mRNA1273 G36_mRNA1273/mRNA1273 G37_mRNA1273/mRNA1273 G38_mRNA1273/mRNA1273 G39_mRNA1273/mRNA1273 G40_mRNA1273/mRNA1273 G41_mRNA1273/mRNA1273 G42_mRNA1273/mRNA1273 E916_AZ/AZ E918_AZ/AZ E993_AZ/AZ E995_AZ/AZ E998_AZ/AZ E999_AZ/AZ E1000_AZ/AZ F5_AZ/AZ F6_AZ/AZ F9_AZ/AZ E919_AZ/BNT E920_AZ/BNT E921_AZ/BNT F41_AZ/BNT E994_AZ/BNT E996_AZ/BNT E997_AZ/BNT F1_AZ/BNT F2_AZ/BNT F3_AZ/BNT F110_BNT/BNT F120_BNT/BNT F121_BNT/BNT F122_BNT/BNT F124_BNT/BNT F262_BNT/BNT F263_BNT/BNT F269_BNT/BNT F271_BNT/BNT F273_BNT/BNT F289_BNT/BNT G291_BA.1 conv. G347_BA.1 conv. G348_BA.1 conv. G353_BA.1 conv. G354_BA.1 conv. G355_BA.1 conv. G356_BA.1 conv. G359_BA.1 conv. G360_BA.1 conv. G362_BA.1 conv. G363_BA.1 conv. G366_BA.1 conv. G369_BA.1 conv. G371_BA.1 conv. G381_BA.1 conv. G393_BA.1 conv. G647_BA.1 conv. G650_BA.1 conv. G665_BA.2 conv. G668_BA.2 conv. G669_BA.2 conv. G670_BA.2 conv. G671_BA.2 conv. G672_BA.2 conv. G673_BA.2 conv. G674_BA.2 conv. G751_BA.2 conv. G776_BA.2 conv. G780_BA.2 conv. G782_BA.2 conv. 218 (Ischgl 1)_WT conv. 220 (Ischgl 1)_WT conv. 224 (Ischgl 1)_WT conv. 260 (Ischgl 1)_WT conv. 262 (Ischgl 1)_WT conv. 278 (Ischgl 1)_WT conv. 279 (Ischgl 1)_WT conv. 280 (ischgl 1)_WT conv. 298 (ischgl 1)_WT conv. 299 (Ischgl 1)_WT conv.
D614G 559 901 85 114 <16 <16 43 255 27 56 98 21 23 19 148 171 120 307 <16 <16 22 139 150 529 <16 926 442 1631 1518 589 180 923 1185 97 348 198 36 325 649 52 43 140 120 839 148 1186 859 2106 915 2403 1003 512 1244 3582 1594 1159 1204 1495 654 1988 357 1592 560 2567 981 689 <16 <16 <16 <16 <16 <16 <16 <16 <16 <16 <16 <16 <16 <16 17 <16 20 <16 <16 22 <16 <16 <16 <16 <16 <16 26 788 204 17 165 200 463 123 132 75 63 200 132 124
B.1.1.7 415 665 66 153 <16 <16 41 353 167 100 479 38 158 184 533 551 432 341 <16 <16 48 642 37 173 <16 867 136 398 470 363 41 389 856 80 425 149 36 197 218 58 63 97 79 948 137 1864 868 1823 404 1642 1446 593 514 3849 843 923 531 1723 369 1020 315 1738 793 601 361 246 <16 <16 <16 <16 <16 <16 <16 <16 <16 <16 <16 <16 <16 <16 36 <16 <16 <16 <16 54 <16 <16 <16 <16 49 <16 <16 448 247 <16 155 60 357 228 32 88 47 200 58 66
B.1.1.7+E484K 16 450 96 117 <16 <16 70 345 142 56 361 <16 130 75 265 266 401 789 137 <16 72 598 139 723 <16 441 178 556 248 192 96 368 397 41 236 74 17 281 124 26 35 63 76 79 130 537 347 424 171 580 402 302 346 4433 719 1269 625 1272 213 796 353 754 81 473 279 240 <16 <16 <16 <16 <16 <16 <16 <16 <16 <16 <16 <16 <16 <16 128 <16 20 <16 <16 <16 <16 <16 <16 <16 40 <16 <16 280 210 <16 283 38 320 392 23 <16 36 26 54 <16
B.1.351 17 1396 164 33 <16 <16 40 156 68 54 429 <16 219 137 114 205 305 909 102 <16 533 2014 634 2025 <16 130 104 614 227 75 24 225 330 45 279 147 <16 251 300 71 30 47 87 67 91 512 457 466 199 709 387 477 226 796 397 235 108 643 778 325 133 615 68 185 176 116 27 <16 <16 <16 <16 <16 <16 24 <16 25 <16 <16 <16 <16 25 <16 <16 <16 <16 39 <16 <16 <16 <16 47 <16 <16 178 97 <16 228 108 178 87 22 <16 22 25 45 <16
B.1.617.2 448 632 126 2726 227 91 146 78 50 77 89 <16 18 17 43 116 133 39 <16 <16 <16 111 <16 254 <16 713 67 484 414 243 31 400 314 24 107 181 25 122 274 191 24 44 101 182 129 716 639 873 291 895 536 294 185 864 458 237 178 349 182 344 155 293 191 384 195 126 <16 <16 <16 <16 <16 <16 <16 47 30 <16 <16 53 <16 <16 20 <16 178 <16 <16 21 <16 20 36 <16 44 <16 47 231 275 <16 257 38 569 259 94 93 36 223 48 51
BA.1 <16 <16 <16 17 <16 <16 <16 <16 <16 <16 <16 <16 <16 <16 <16 <16 <16 <16 <16 <16 <16 26 <16 <16 <16 <16 <16 <16 16 <16 <16 <16 <16 <16 <16 <16 <16 <16 <16 <16 <16 <16 <16 <16 <16 62 <16 32 <16 42 25 <16 <16 69 23 28 <16 19 <16 50 <16 65 <16 18 <16 <16 106 31 68 <16 44 21 34 122 76 54 144 42 51 80 140 <16 365 52 30 <16 <16 21 28 <16 <16 <16 45 111 <16 <16 <16 <16 109 <16 <16 <16 <16 <16 <16 22
BA.2 30 113 58 34 <16 <16 83 24 17 <16 33 <16 <16 <16 18 56 98 24 <16 <16 30 100 20 <16 <16 <16 85 273 112 184 <16 115 137 <16 120 46 <16 34 <16 <16 <16 147 48 30 38 257 146 351 162 440 125 217 1024 632 160 254 151 501 82 273 <16 186 <16 309 95 109 17 <16 <16 <16 <16 <16 46 21 23 37 51 113 27 37 204 <16 693 44 391 88 1101 553 335 96 157 120 525 548 69 1055 40 41 170 149 <16 <16 <16 <16 25 <16
BA.5 <16 447 68 67 <16 <16 21 21 <16 <16 52 <16 <16 <16 17 141 233 <16 <16 <16 26 177 <16 <16 <16 18 23 91 84 25 <16 29 53 <16 55 <16 <16 <16 <16 <16 <16 21 <16 <16 26 176 56 202 38 67 60 57 66 343 50 47 28 36 38 83 28 68 <16 91 22 47 <16 <16 <16 <16 <16 <16 <16 29 <16 <16 <16 <16 <16 26 28 <16 33 <16 56 50 17 52 41 37 66 24 69 447 47 27 74 <16 72 62 <16 <16 <16 <16 26 <16

Let’s have a look at the titer data:

multi_exposure_data %>%
  # log2 transform the data and set <LOD values to LOD/2
  mutate(logtiter = ifelse(titer == "<16", log2(16/20), log2(as.numeric(titer)/10)),
         sr_group = factor(sr_group, levels = sr_group_colors$SerumGroup)) -> data_long


# calculate
data_long %>%
  group_by(
    sr_group,
    ag_name
  ) %>%
  # if you want to use quick LOD/2 method for GMT calculation
  summarize(logtiter = mean(logtiter, na.rm = TRUE)) %>%
  #  We recommend using the titertools package for GMT/fold change calculation where <LOD values are estimated in 
  # a bayesian approach. As it runs a bayesian model in the background, it takes a bit longer to run. The code to run
  # it is commented out below. It returns the mean, lower and upper estimates of the mean and of the standard deviation
  # summarize(gmt = titertools::gmt(titer, dilution_stepsize = 0)["mean", "estimate"]) %>%
  # manually set GMT's that are lower than that to LOD2
  mutate(logtiter = ifelse(logtiter < log2(0.8), log2(0.8), logtiter),
         sr_name = "GMT",
         gmt = logtiter)-> gmt_data

do_titerplot <- function(data_long, target_sr_groups, gmt_data, sr_group_colors, ag_order){
  
  data_long %>%
  filter(sr_group %in% target_sr_groups) %>%
  mutate(logtiter = ifelse(titer == "<16", log2(16/20), log2(as.numeric(titer)/10))) %>%
  ggplot(aes(x = ag_name, y = logtiter, color = sr_group, group = sr_name)) + 
  geom_line(aes(group = sr_name), alpha = 0.3) + 
  geom_point(alpha = 0.3) + 
  geom_line(data = gmt_data %>%
              filter(sr_group %in% target_sr_groups), size = 1.5) + 
  geom_point(data = gmt_data %>%
               filter(sr_group %in% target_sr_groups), fill = "white", shape = 21, size = 3) +
  scale_color_manual(values = setNames(sr_group_colors$Color, sr_group_colors$SerumGroup),
                     name = "Serum group") +
  scale_x_discrete(limits = ag_order,
                   name = "Variant") + 
  scale_y_continuous(name = "Titer",
                     labels = function(x) round(2^x*10),
                     breaks = seq(log2(16/10), 12, 1)) +
  annotate("rect",
                xmin = -Inf,
                xmax = Inf,
                ymin = -Inf,
                ymax = log2(16/10),
            fill = "grey",
            alpha = 0.6,
            color = NA) +
  facet_wrap(~sr_group) + 
  theme_bw() + 
  theme(strip.background = element_blank(),
        axis.text.x = element_text(angle = 45, hjust = 1)) -> p
  
  return(p)
  
}

single_exposure_groups <- c('delta conv.', 'alpha conv.', 'beta conv.', 'mRNA1273/mRNA1273', 'AZ/AZ', 'AZ/BNT', 'BNT/BNT', 'BA.1 conv.', 'BA.2 conv.', 'WT conv.')

# plot the data
do_titerplot(data_long, target_sr_groups = single_exposure_groups,
              gmt_data = gmt_data, 
             sr_group_colors = sr_group_colors,
             ag_order = rownames(titer_data))

Individual titers are shown as faint lines, geometric mean titers (GMT) as thick lines. We can see some trends, for example BA.1’s escape in most serum groups.

# set seed for random optimization start
set.seed(100)

# create the acmap object
map <- acmap(
    ag_names = rownames(titer_data),
    sr_names = colnames(titer_data),
    titer_table = titer_data)

# The dilution stepsize gives the dilution steps of the neutralization assay on the log2 scale, e.g.: 0 = continuous read out, 1 = two-fold dilutions
dilutionStepsize(map) <- 0

# optimize the map in 2 dimensions 
map <- optimizeMap(
    map,
    number_of_dimensions = 2,
    number_of_optimizations = 1000,
    minimum_column_basis = "none",
    options = list(ignore_disconnected = TRUE) # Map contains disconnected points (points that are not connected through any path of detectable titers so cannot be coordinated relative to each other)
  )
plot(map, plot_labels = "antigens")

Above is a basic map. We will add meta-information in the next step to make it interpretable:

# Add fill color for map
agFill(map) <- mapColors[agNames(map),]
agGroups(map) <- agNames(map)

# Add serum group = serum cohort to map. It is stored here in the sample name, the second field after "_"
sr_groups <- unlist(lapply(srNames(map), function(x) str_split(x, "_")[[1]][2]))
srGroups(map) <- factor(sr_groups, levels = sr_group_colors$SerumGroup)
srOutline(map) <- mapColors[as.character(srGroups(map)),]

# Add general style guidelines to map
srOutlineWidth(map) <- 1
srSize(map) <- 9
agSize(map) <- 18
srOutline(map) <- adjustcolor(srOutline(map), alpha.f = 0.7)

# make some antigens smaller than others
sublineages <- c("B.1.1.7+E484K")
agSize(map)[agNames(map) %in% sublineages] <- 15

# align map to example map. Important: antigen names have to match
map <- realignMap(map, og_map)

# get map limits for plotting
lims_no_zoom <- Racmacs:::mapPlotLims(map, sera = TRUE)

xlim_no_zoom <- round(lims_no_zoom$xlim)
ylim_no_zoom <- round(lims_no_zoom$ylim)


lims_zoom <- Racmacs:::mapPlotLims(map, sera = FALSE)

xlim_zoom <- round(lims_zoom$xlim)
ylim_zoom <- round(lims_zoom$ylim)

# plot map
layout(matrix(c(1:2), ncol = 2, byrow = FALSE))
par(oma=c(0, 0, 0, 0), mar=c(0, 0.5, 0, 0))
plot(map, plot_labels = "antigens", label.offset = 1.5, xlim = xlim_zoom, ylim = ylim_zoom, plot_stress = TRUE)
plot(map, plot_labels = "antigens", label.offset = 1.5, xlim = xlim_no_zoom, ylim = ylim_no_zoom, plot_stress = TRUE)

Above we see the same antigenic map, once zoomed in (left) and once zoomed out (right). Variants are shown as coloured circles, sera are shown as open squares in the colour of their root variant (or as triangles pointing toward their position if they outside of the shown map area). Characteristics of a single exposure map are that sera are located close to their infecting/homologous variant as they have the highest titer against this variant. Hence, the antigenic distance to this variant is 0. All other variants are located at the Euclidean distance that best matches the log2(fold change) of titers from the homologous variant to this variant. However, you will see that sera are not at exactly the same position as their homologous variant. That is because the optimization algorithm finds the optimum for all serum and variant coordinates to best match the titer fold changes. As there is titer variation between individual samples, there will be variation in their positions. The mismatch between target titer distance (titer fold changes) and euclidean map distance (antigenic distance) is given by the map stress in the lower left corner. The lower the stress, the better the match between target and map distance.

Interactive viewer

Racmacs provides an interactive viewer with featuers such as map randomization, reoptimization, movement of individual objects, map stress investigation. To properly use these featuers, non-positioned sera and variants (due to too many not measured/undetectable) titers need to be removed.

If the viewer does not properly render here, you can open it in any R session.

remove_na_sera <- function(map){
  
  na_sera <- srCoords(map)
  na_sera <- rownames(na_sera)[is.na(na_sera)[,1]]
  
  map <- removeSera(map, na_sera)
  
  return(map)
}

remove_na_antigens <- function(map){
  
  na_sera <- agCoords(map)
  na_sera <- rownames(na_sera)[is.na(na_sera)[,1]]
  
  map <- removeAntigens(map, na_sera)
  
  return(map)
}

remove_na_coords <- function(map){
  
  map <- remove_na_sera(map)
  map <- remove_na_antigens(map)
  
  return(map)
  
}

Racmacs::view(remove_na_coords(map))

Map uncertainty

The plot below shows how much each variant and serum is allowed to move without increasing the map stress by more than one unit. It illustrates that more >LOD titers in different types of sera increase the positional resolution (e.g. the blue D614G has >LOD in almost all sera, whereas BA.1 has <LOD in almost all sera). For this diagnostic step, again sera that could not be positioned need to be removed from the map.

# triangulation does not work with not-positioned sera - sera cannot be position if too many titers are <LOD
plot(triangulationBlobs(relaxMap(remove_na_coords(map)), stress_lim = 1, grid_spacing = 0.05), fill.alpha = 0.9, plot_labels = "antigens", outline.alpha = 0.9, xlim = xlim_zoom, ylim = ylim_zoom,
       grid.col = "#cfcfcf", grid.margin.col="#7d7d7d", cex=0.3, plot_stress = TRUE)

To demonstrate the importance of different types of sera on map conformation and positional resolution, I will now create maps subset to

  1. only 2x Pfizer vaccine sera,

  2. previous + alpha sera (different homologous variant but similar antigenic profile)

  3. previous + beta sera (sera with more distinct antigenic profile)

  4. previosu + BA.1 sera (sera with very distinct antigenic profile)

subset_and_optimize_map <- function(full_map, kept_sr_groups){
  
  map <- removeSera(full_map, srNames(full_map)[!as.character(srGroups(full_map)) %in% kept_sr_groups])
  
  map <- optimizeMap(
    map,
    number_of_dimensions = 2,
    number_of_optimizations = 1000,
    minimum_column_basis = "none",
    options = list(ignore_disconnected = TRUE)
  )
  
  map <- realignMap(relaxMap(remove_na_coords(map)), full_map)
  
  srOutlineWidth(map) <- 2
  srSize(map) <- 9*4
  agSize(map) <- 18*4
  srOutline(map) <- adjustcolor(srOutline(map), alpha.f = 0.7)

  return(map)
  
}


vax_map <- subset_and_optimize_map(map, c("BNT/BNT"))
alpha_map <- subset_and_optimize_map(map, c("BNT/BNT", "alpha conv."))
beta_map <- subset_and_optimize_map(map, c("BNT/BNT", "alpha conv.", "beta conv."))
ba1_map <- subset_and_optimize_map(map, c("BNT/BNT", "alpha conv.", "BA.1 conv.", "beta conv."))

ylim_no_zoom[2] <- ylim_no_zoom[2] + 1

layout(matrix(c(1:8), ncol = 4, byrow = FALSE))
par(oma=c(0, 0, 0, 0), mar=c(0.5, 0, 0.1, 0))

plot(procrustesMap(vax_map, map, sera = FALSE), fill.alpha = 0.9, plot_labels = FALSE, outline.alpha = 0.9, xlim = xlim_no_zoom, ylim = ylim_no_zoom, grid.col = "#cfcfcf", grid.margin.col="#7d7d7d", cex=0.3, plot_stress = TRUE)
plot(triangulationBlobs(vax_map, stress_lim = 1, grid_spacing = 0.05), fill.alpha = 0.9, plot_labels = FALSE, outline.alpha = 0.9, xlim = xlim_no_zoom, ylim = ylim_no_zoom,
       grid.col = "#cfcfcf", grid.margin.col="#7d7d7d", cex=0.3, plot_stress = TRUE)


plot(procrustesMap(alpha_map, map, sera = FALSE), fill.alpha = 0.9, plot_labels = FALSE, outline.alpha = 0.9, xlim = xlim_no_zoom, ylim = ylim_no_zoom, grid.col = "#cfcfcf", grid.margin.col="#7d7d7d", cex=0.3, plot_stress = TRUE)
plot(triangulationBlobs(alpha_map, stress_lim = 1, grid_spacing = 0.05), fill.alpha = 0.9, plot_labels = FALSE, outline.alpha = 0.9, xlim = xlim_no_zoom, ylim = ylim_no_zoom,
       grid.col = "#cfcfcf", grid.margin.col="#7d7d7d", cex=0.3, plot_stress = TRUE)

plot(procrustesMap(beta_map, map, sera = FALSE), fill.alpha = 0.9, plot_labels = FALSE, outline.alpha = 0.9, xlim = xlim_no_zoom, ylim = ylim_no_zoom, grid.col = "#cfcfcf", grid.margin.col="#7d7d7d", cex=0.3, plot_stress = TRUE)
plot(triangulationBlobs(beta_map, stress_lim = 1, grid_spacing = 0.05), fill.alpha = 0.9, plot_labels = FALSE, outline.alpha = 0.9, xlim = xlim_no_zoom, ylim = ylim_no_zoom,
       grid.col = "#cfcfcf", grid.margin.col="#7d7d7d", cex=0.3, plot_stress = TRUE)


plot(procrustesMap(ba1_map, map, sera = FALSE), fill.alpha = 0.9, plot_labels = FALSE, outline.alpha = 0.9, xlim = xlim_no_zoom, ylim = ylim_no_zoom, grid.col = "#cfcfcf", grid.margin.col="#7d7d7d", cex=0.3, plot_stress = TRUE)
plot(triangulationBlobs(ba1_map, stress_lim = 1, grid_spacing = 0.05), fill.alpha = 0.9, plot_labels = FALSE, outline.alpha = 0.9, xlim = xlim_no_zoom, ylim = ylim_no_zoom,
       grid.col = "#cfcfcf", grid.margin.col="#7d7d7d", cex=0.3, plot_stress = TRUE)

From left to right:

  1. only 2x Pfizer vaccine sera,

  2. previous + alpha sera (different homologous variant but similar antigenic profile)

  3. previous + beta sera (sera with more distinct antigenic profile)

  4. previosu + BA.1 sera (sera with very distinct antigenic profile)

The top row shows the antigenic maps with arrows pointing to the variants’ positions in the full map, the bottom row shows the constant force loci (triangulation blobs).

The third map gives a very similar confirmation as the “full map” but the positional resolution of the Omicron variants is very low.

Making antibody landscapes

Multiple exposures increase cross-neutralization and hence fold changes of titers between variants are less pronounced. Creating a map from such sera results in a map where variants are closer together than in single exposure sera maps and an underestimation of antigenic relationships. Multiple exposures do not change the antigenic relationships of virus strains, but change how cross-reactive sera are. We can see that by comparing fold changes from the homologous titer variant after 2 and 3 Pfizer vaccines:

# get GMT fold change from D614G
gmt_data %>%
  filter(sr_group %in% c("BNT/BNT", "BNT/BNT/BNT")) %>%
  group_by(sr_group) %>%
  mutate(log_fc = logtiter[ag_name == "D614G"] - logtiter,
         fc = ifelse(log_fc > 0, paste0("-", round(2^abs(log_fc), 1)), round(2^abs(log_fc), 1))) -> fc_data

do_titerplot(data_long,
             target_sr_groups = c("BNT/BNT", "BNT/BNT/BNT"),
             gmt_data,
             sr_group_colors,
             ag_order = rownames(titer_data)) + 
  geom_text(data = fc_data, aes(label = fc), y = 10.5) + 
  geom_text(data = fc_data, aes(label = abs(round(log_fc, 1))), y = 10)

The top row of numbers lists the GMT fold change from D614G, the row below the log2(fold change) and thereby the antigenic distance. The antigenic distance difference is the most pronounced for the most distant Omicron variants.

To illustrate this effect on an antigenic map, we will create a map from 3x Pfizer vaccine sera and compare it to the 2x Pfizer vaccine sera:

# make table for 3x Pfizer sera
multi_exposure_data %>%
  filter(sr_group == "BNT/BNT/BNT") %>%
  select(!sr_group) %>%
  pivot_wider(names_from = "sr_name", values_from = "titer") %>%
  column_to_rownames("ag_name") -> table_3x


# make 3x map
# create the acmap object
map_3x <- acmap(
    ag_names = rownames(table_3x),
    sr_names = colnames(table_3x),
    titer_table = table_3x)

# The dilution stepsize gives the dilution steps of the neutralization assay on the log2 scale, e.g.: 0 = continuous read out, 1 = two-fold dilutions
dilutionStepsize(map_3x) <- 0

# optimize the map in 2 dimensions 
map_3x <- optimizeMap(
    map_3x,
    number_of_dimensions = 2,
    number_of_optimizations = 1000,
    minimum_column_basis = "none",
    options = list(ignore_disconnected = TRUE) # Map contains disconnected points (points that are not connected through any path of detectable titers so cannot be coordinated relative to each other)
  )

map_3x <- realignMap(map_3x, map)
srOutline(map_3x) <- "grey20"
agFill(map_3x) <- mapColors[agNames(map_3x), "Color"]

srOutlineWidth(map_3x) <- 2
srOutline(map_3x) <- adjustcolor(srOutline(map_3x), alpha.f = 0.7)
agSize(map_3x) <- 18*4
srSize(map_3x) <- 9*4
# Do plotting
layout(matrix(c(1:3), ncol = 3, byrow = FALSE))
par(oma=c(0, 0, 0, 0), mar=c(0.1, 0.5,0, 0))

plot(map_3x, fill.alpha = 0.9, plot_labels = FALSE, outline.alpha = 0.9, xlim = xlim_zoom, ylim = ylim_zoom + 0.5, grid.col = "#cfcfcf", grid.margin.col="#7d7d7d", cex=0.3, plot_stress = TRUE)

plot(realignMap(vax_map, map_3x), fill.alpha = 0.9, plot_labels = FALSE, outline.alpha = 0.9, xlim = xlim_zoom, ylim = ylim_zoom+0.5, grid.col = "#cfcfcf", grid.margin.col="#7d7d7d", cex=0.3, plot_stress = TRUE)

plot(procrustesMap(map_3x, vax_map, sera = FALSE), fill.alpha = 0.9, plot_labels = FALSE, outline.alpha = 0.9, xlim = xlim_zoom, ylim = ylim_zoom + 0.5, grid.col = "#cfcfcf", grid.margin.col="#7d7d7d", cex=0.3, plot_stress = TRUE)

On the left, we see the map from 3x Pfizer vaccine sera, the middle map is the 2x Pfizer vaccine map and on the right, the 3x Pfizer map is shown with arrows pointing to the variants’ positions in the 2x Pfizer map. The arrows point outside, indicating that the variants are closer to each other in the 3x Pfizer map, and therefore look more antigenically similar than in the 2x Pfizer map. However, the virus isolates used in the assay are the same. Again, it is not the antigenic relationships and properties of the viruses that change with more exposures, but the antibody composition in the multi-exposure sera.

Antibody landscapes

Antibody landscapes better represent multiple exposure sera. Neutralization titers are mapped as a continuous surface above an antigenic map: Titers against the respective variant are mapped above the variant in a third dimension, the height representing the titer magnitude, and connected through a surface, the slope of which indicates the cross-reactivity. The smaller the slope, the more cross-reactive the sample.

In the single cone landscape approached used here, cone coordinates (x, y) and cone heights (z) are fitted to titers per subject. The slope of this cone is optimised over all subjects per serum group, assuming the same decrease of neutralization across antigenic space. The cone slopes indicate the breadth of neutralization titers. As immune history becomes more complex, for example for influenza, a single cone approach does not provide a good approach and a loess fit should be used.

The description above focusses on antibody neutralization, not binding. The landscape fitting approach remains the same, however.

# orientation for landscapes
angle <- list(
  rotation = c(-1.4681,0.004,-0.0162),
  translation = c(0, 0.05,0.1), 
  zoom = 1.45
)

# a function to get titer tables from long format to matrix form as input for the antibody landscape fit 
get_titertable <- function(data, group) {
  
  data %>% 
    select(
      ag_name,
      sr_name,
      titer
    ) %>%
    mutate(
      titer = replace(titer, is.na(titer), "*")
    ) %>%
    pivot_wider(
      id_cols = sr_name,
      names_from = ag_name,
      values_from = titer
    ) %>% 
    as.matrix() -> titermatrix
  
  attr(titermatrix, "sr_group") <- group$sr_group
  rownames(titermatrix) <- titermatrix[,"sr_name"]
  titermatrix <- titermatrix[,-1]
  
   return(titermatrix)
  
}

# plot the base map as r3js object
base_plot_data3js <- function(map, lndscp_fits, highlighted_ags, lims, ag_plot_names,
                              add_border = TRUE, add_axis = TRUE, add_ag_labels = TRUE){
  
  # get coordinates for variants that should be plotted (highlighted ags)
    x_coords <- c(agCoords(map)[agNames(map) %in% highlighted_ags, 1])
    y_coords <- c(agCoords(map)[agNames(map) %in% highlighted_ags, 2])
    z_coords <- rep(0.02, length(highlighted_ags))
    ag_point_size <- c(rep(14, length(highlighted_ags))) / 5
    ag_col <- c(agOutline(map)[agNames(map) %in% highlighted_ags])
    ag_fill <- c(agFill(map)[agNames(map) %in% highlighted_ags])
    labels <- c(ag_plot_names[agNames(map) %in% highlighted_ags])
    border_col <- "grey50"
  
  z_lims <- c(0,10)
  axis_at <- seq(z_lims[1], z_lims[2],2)
  
  # Setup plot
  data3js <- ablandscapes:::lndscp3d_setup(
    xlim = lims$xlim,
    ylim = lims$ylim,
    zlim = z_lims,
    aspect.z = 0.5,
    options = list(
      lwd.grid =  0.05,
      sidegrid.lwd = 1,
      sidegrid.col = border_col,
      sidegrid.at = list("z" = axis_at),
      zaxt = "log"
    ),
    show.axis = FALSE
  )
  
  # add z axis
  if(add_axis){

    axis_labels <- 2^axis_at*10
    
    data3js <- r3js::axis3js(
      data3js,
      side = "z",
      at = axis_at,
      labels = axis_labels,
      cornerside = "f",
      size = 20,
      alignment = "right"
    )
  }

  # Add basemap
  data3js <- lndscp3d_map(
    data3js = data3js,
    fit = lndscp_fits[[1]],
    xlim = lims$xlim,
    ylim = lims$ylim,
    zlim = c(0, 10),
    show.map.sera = FALSE,
    options = list(
      opacity.basemap = 0.3
    )
  )
  
  # add variants
  data3js <- r3js::points3js(
    data3js,
    x          = x_coords,
    y          = y_coords,
    z          = z_coords,
    size       = ag_point_size,
    col        = ag_col,
    fill       = ag_fill,
    lwd        = 0.5,
    opacity    = 1,
    highlight  = list(col = "red"),
    label      = labels,
    toggle     = "Basepoints",
    depthWrite = FALSE,
    shape      = "circle filled"
  )
  
  if(add_ag_labels){
    text_x <- x_coords
    text_y <- c(y_coords - ag_point_size*0.2)
    
    data3js <- r3js::text3js(
        data3js,
        x          = text_x,
        y          = text_y,
        z          = z_coords,
        text       = labels,
        size       = c(rep(10*0.02, length(text_x))), 
        alignment  = "center"
      )
    
  }
  
  # thicker border of the base map
  if(add_border){
    data3js <- lines3js(data3js, x = c(lims$xlim[1],lims$xlim[1]), y = c(lims$ylim[1], lims$ylim[2]), z = c(0, 0),
                        lwd = 1.2, col = border_col)
    data3js <- lines3js(data3js, x = c(lims$xlim[2],lims$xlim[2]), y = c(lims$ylim[1], lims$ylim[2]), z = c(0, 0),
                        lwd = 1.2, col = border_col)
    
    # y border
    data3js <- lines3js(data3js, x = c(lims$xlim[1],lims$xlim[2]), y = c(lims$ylim[1], lims$ylim[1]), z = c(0, 0),
                        lwd = 1.2, col = border_col)
    data3js <- lines3js(data3js, x = c(lims$xlim[1],lims$xlim[2]), y = c(lims$ylim[2], lims$ylim[2]), z = c(0, 0),
                        lwd = 1.2, col = border_col)

    data3js <- r3js::box3js(
      data3js,
      col   = border_col
    )
    
  }
  

  return(data3js)
}

# plot landscapes from a list that contains the fits
plot_landscapes_from_list <- function(data3js, # base map as r3js object
                                      titertables_groups, # grouped titer table object for serum groups
                                      lndscp_fits, # landscape fits as list
                                      map, # map for x,y coordinates of GMT points
                                      highlighted_ags, # which map antigens should be highlighted (GMT coordinates)
                                      lndscp_colors, # colors for the landscapes
                                      gmt_data = NULL, 
                                      show_gmts = TRUE, 
                                      show.individual.surfaces = FALSE, 
                                      options.individual.surfaces = list(opacity.surface.grid = 0.4,
                                                                         opacity.surface = 0.2, 
                                                                         col.surface = "grey70", 
                                                                         col.surface.grid = "grey70"), 
                                      show_landscapes = TRUE){
  
  
  # get x and y coords for gmt points
    x_coords <- c(agCoords(map)[agNames(map) %in% highlighted_ags, 1])
    y_coords <- c(agCoords(map)[agNames(map) %in% highlighted_ags, 2])
    coords <- cbind(x_coords, y_coords)
    coords <- coords[!is.na(x_coords),]
    
    
    # for each landscape fit object (serum group), do the plotting
  for (i in seq_along(lndscp_fits)) {
    
    # select sr group
    srg <- titertables_groups$sr_group[i]
    lndscp_fit <- lndscp_fits[[i]]
    
    for (j in seq_len(nrow(coords))) {
      
      if(show_gmts){
        
        if(is.null(gmt_data)){
          warning("Error: You want to plot GMT but did not provide GMT data")
          return()
        }
        
        # select the gmt data for the serum group
        gmts <- gmt_data %>%
          filter(sr_group == srg)
    
        gmts <- gmts[match(rownames(coords), gmts$ag_name),]

        # plot GMTs
        data3js <- r3js::lines3js(
          data3js,
          x = rep(coords[j, 1], 2),
          y = rep(coords[j, 2], 2),
          z = c(0, gmts$gmt[j]),
          col = "grey50",
          toggle = sprintf("GMT, %s", srg),
          geometry = TRUE,
          opacity = 0.7,
          lwd = 0.2 
        )
        
        data3js <- r3js::points3js(
          data3js,
          x         = coords[j, 1],
          y         = coords[j, 2],
          z         = gmts$gmt[j],
          size      = 0.9,
          col  = lndscp_colors[srg, "Color"],
          toggle = sprintf("GMT, %s", srg),
          opacity   = 1 
          
        )
      }
      
    }

    if(show_landscapes){
      
      # Add GMT landscapes
    data3js <- lndscp3d_surface(
      data3js = data3js,
      object = lndscp_fit,
      crop2chull = FALSE,
      toggle = sprintf("GMT landscape, %s", srg),
      grid_spacing = 0.5,
      padding = 0.2,
      options = list(
        col.surface = lndscp_colors[srg, "Color"],
        opacity.surface = 1
      )
    )
    
    fit <- lndscp_fit
    
    # show individual landscapes
    if (show.individual.surfaces) {
      for (i in seq_len(nrow(fit$titers))) {
        individual_fit <- fit
        individual_fit$titers <- fit$titers[i, ]
        individual_fit$logtiters <- fit$logtiters[i, ]
        individual_fit$logtiters.upper <- individual_fit$logtiters.upper[i, 
        ]
        individual_fit$logtiters.lower <- individual_fit$logtiters.lower[i, 
        ]
        individual_fit$lessthans <- individual_fit$lessthans[i, 
        ]
        individual_fit$morethans <- individual_fit$morethans[i, 
        ]
        individual_fit$fitted.values <- NULL
        individual_fit$residuals <- NULL
        individual_fit$residuals.lessthan <- NULL
        individual_fit$residuals.morethan <- NULL
        if (!is.null(individual_fit$cone)) {
          individual_fit$cone$cone_coords <- individual_fit$cone$cone_coords[i, 
                                                                             , drop = F]
          individual_fit$cone$cone_heights <- individual_fit$cone$cone_heights[i]
        }
        data3js <- lndscp3d_surface(data3js = data3js, object = individual_fit, 
                                    crop2chull = FALSE, grid_spacing = 0.5, 
                                    padding = 0.1,
                                    options = options.individual.surfaces,
                                    toggle = "Individual landscapes")
      }
    }
      
    }
    
    
    
    }
 
  
  return(data3js)
}

# set orientation
set_r3js_orentation <- function(data3js, angle){
  r3js(
  data3js,
  rotation = angle$rotation,
  zoom = angle$zoom
)
}

Doing the landscape fits

# group multi exposure data by serum group
multi_exposure_data %>%
  group_by(
    sr_group
  ) -> titerdata

titerdata %>%
  group_map(
    get_titertable
  ) -> titertables


# do the fit per serum group on base map
lndscp_fits <- lapply(titertables, function(titer_table){
  ablandscape.fit(
           titers = titer_table, #errors can occur if the titertable for one serum group contains only one sample (then this sample is passed as a vector rather than a matrix) and if all titers in a sample are <LOD/NA. The titers are fitted, not the logtiters. <LOD values are estimated, but if all values are <LOD there is too little data to estimate from
           bandwidth = 1,
           degree = 1,
           method = "cone",
           error.sd = 1,
           acmap = map,
           control = list(
             optimise.cone.slope = TRUE
           )
         )
  
})


# calculate GMTs per serum group
titertables_groups <- group_data(titerdata)

Base map for antibody landscapes

# plot the base map
data3js <- base_plot_data3js(map, lndscp_fits, highlighted_ags = agNames(map), ag_plot_names = agNames(map), lims = lims_zoom)

set_r3js_orentation(data3js, angle)

Add 2 dose Pfizer GMTs

# plot 2x vax gmts
pfizer2_ind <- match("BNT/BNT", titertables_groups$sr_group)

# do lndscp colors in correct format
lndscp_colors <- sr_group_colors %>%
  column_to_rownames("SerumGroup")

gmt_2x <- plot_landscapes_from_list(data3js = data3js, titertables_groups = titertables_groups[pfizer2_ind,], lndscp_fits =  lndscp_fits[pfizer2_ind], map = map, gmt_data =  gmt_data, highlighted_ags = agNames(map), lndscp_colors = lndscp_colors, show_gmts = TRUE, show.individual.surfaces = FALSE, show_landscapes = FALSE)

set_r3js_orentation(gmt_2x, angle)

Add GMT surface

gmt_scape_2x <- plot_landscapes_from_list(data3js = data3js, titertables_groups = titertables_groups[pfizer2_ind,], lndscp_fits =  lndscp_fits[pfizer2_ind], map = map, gmt_data =  gmt_data, highlighted_ags = agNames(map), lndscp_colors = lndscp_colors, show_gmts = TRUE, show.individual.surfaces = FALSE, show_landscapes = TRUE)

set_r3js_orentation(gmt_scape_2x, angle)

Add individual subject surfaces

gmt_scape_2x <- plot_landscapes_from_list(data3js = data3js, titertables_groups = titertables_groups[pfizer2_ind,], lndscp_fits =  lndscp_fits[pfizer2_ind], map = map, gmt_data =  gmt_data, highlighted_ags = agNames(map), lndscp_colors = lndscp_colors, show_gmts = TRUE, show.individual.surfaces = TRUE, show_landscapes = TRUE)

set_r3js_orentation(gmt_scape_2x, angle)

Plot 2 doses vs. 3 doses landscapes

target_groups <- grep("BNT/BNT|BNT/BNT/BNT", titertables_groups$sr_group)[c(1,3)]

lndscps_plot <- plot_landscapes_from_list(data3js = data3js, titertables_groups = titertables_groups[target_groups,], lndscp_fits =  lndscp_fits[target_groups], map = map, gmt_data =  gmt_data, highlighted_ags = agNames(map), lndscp_colors = lndscp_colors, show_gmts = TRUE, show_landscapes = TRUE)

set_r3js_orentation(lndscps_plot, angle)

The antibody landscapes after different doses illustrate the increasing breadth after repeated exposure. The cone slope of each landscape is a measure of its breadth and can be accessed via the cone field of each antibody landscape fit, which also contains the fitted cone coordinates and cone heights per subject. The underlying geometry of the antigenic map - one antigenic distance unit corresponds to one two-fold change of neutralization titers - means that single exposure sera, from which the map was built, have a landscape slope around 1 on the log2 scale: With each antigenic distance unit away from the cone apex, the titers decrease by one two-fold. The slopes of multi-exposure sera are much lower than the slopes of single exposure sera:

slopes <- lapply(lndscp_fits, function(fit){
  round(fit$cone$cone_slope,1)
})

slopes_df <- data.frame("Serum group" = factor(titertables_groups$sr_group, levels = levels(srGroups(map))),
                        "Landscape slope" = unlist(slopes)) %>%
  arrange(Serum.group)

kable(slopes_df, col.names = c("Serum group", "Landscape slope")) %>%
  kable_styling(full_width = F)
Serum group Landscape slope
mRNA1273/mRNA1273 1.2
BNT/BNT 1.2
AZ/BNT 1.1
AZ/AZ 1.1
WT conv. 1.0
alpha conv. 1.3
beta conv. 1.5
delta conv. 1.2
BA.1 conv. 0.8
BA.2 conv. 1.3
Vacc+BA.1 0.7
Vacc+BA.2 0.5
BA.1 reinf. 0.6
BA.2 reinf. 0.4
BNT/BNT/BNT 0.7
Vacc+BA.1 reinf. 0.7
AZ/AZ+delta 0.8
BNT/BNT+delta 0.8

Plot all GMT surfaces

lndscps_plot <- plot_landscapes_from_list(data3js = data3js, titertables_groups = titertables_groups, lndscp_fits =  lndscp_fits, map = map, gmt_data =  gmt_data, highlighted_ags = agNames(map), lndscp_colors = lndscp_colors, show_gmts = TRUE, show_landscapes = TRUE)

set_r3js_orentation(lndscps_plot, angle)

Quality check for antibody landscapes

An easy first visual inspection is also the direct comparison between landscape fitted GMT and measured GMT:

# bind landscape fit and calculated gmt
combine_landscape_and_calculated_gmt <- function(lndscp_fits, gmt_data, sr_group_fields = 2){
  
  lndscp_gmts <- lapply(lndscp_fits, function(x){
    
    gmts <- data.frame(logtiter = x$fitted.value,
                       ag_name = names(x$fitted.value),
                       sr_group = str_split(rownames(x$logtiters)[1], "_")[[1]][sr_group_fields])
    return(gmts)
  })
  
  lndscp_gmts <- do.call(rbind, lndscp_gmts)
  
  
  ## compare lndscp gmts and claculated gmts
  comb_gmt <- rbind(lndscp_gmts %>%
                      mutate(Data = "Fitted Landscape GMT"),
                    gmt_data %>%
                      select(sr_group, ag_name, logtiter) %>%
                      unique() %>%
                      mutate(Data = "Calculated GMT"))
  
  
  return(comb_gmt)
}


comb_data <- combine_landscape_and_calculated_gmt(lndscp_fits, gmt_data %>%
                                                    mutate(logtiter = gmt) %>%
                                                    select(!gmt),
                                                  sr_group_fields = 2)


# plot comparison
comb_data %>%
  mutate(sr_group = factor(sr_group, levels = rownames(lndscp_colors))) %>%
    ggplot(aes(x = ag_name, y = logtiter, color = Data, fill = Data)) + 
    geom_line(aes(group = Data), position = position_dodge(width = 0.05)) + 
    geom_point(shape = 21, color = "grey20", position = position_dodge(width = 0.05)) + 
    scale_x_discrete(name = "Variant",
                     limits = agNames(map)) + 
    scale_y_continuous(breaks = seq(log2(16/10), 12, by = 2),
                     labels = function(x) ifelse(x == log2(16/10), paste0("<", 16), round(2^x*10)),
                     limits = c(-3, 12),
                       name = "GMT") + 
    annotate(
      "rect",
      xmin= -Inf,
      xmax = Inf,
      ymin = -Inf, 
      ymax = log2(16/10),
      fill = "grey50",
      color = NA,
      alpha = 0.2
    ) +
    facet_wrap(~sr_group,
               labeller = label_wrap_gen(multi_line = TRUE)) + 
    theme_bw() +
    theme(strip.background = element_blank(),
          axis.text.x = element_text(angle = 45, hjust = 1, size = 10),
          axis.text.y = element_text(size = 8)) -> p

p

Each ablandscape fit object contains a residuals field to check for error between measured logtiter and fitted landscape GMT for individual samples. The residuals are given on the log2 scale, so a residual of 1 corresponds to 1 2-fold change difference between measured titer and fitted landscape GMT.

# get residuals into long format
residuals_to_long <- function(residuals, values_name = "residuals"){
  
  ag_names <- colnames(residuals)
  as.data.frame(residuals) %>%
    rownames_to_column(var = "sr_name") %>%
    pivot_longer(cols = all_of(ag_names), names_to = "ag_name", values_to = values_name) -> residuals_long
  
  
  return(residuals_long)
}
  
# extract residuals from landscape fit function
combine_residuals <- function(fit, sr_group_fields = 2){
  
  
  residuals <- fit$residuals
  less_thans <- fit$residuals.lessthan
  more_thans <- fit$residuals.morethan
  
  long_res <- residuals_to_long(residuals, "residuals")
  long_less <- residuals_to_long(less_thans, "less_than")
  long_more <- residuals_to_long(more_thans, "more_than")
  
  serum_group <- paste0(str_split(long_res$sr_name[1], "_")[[1]][sr_group_fields], collapse = "_")
  
  comb <- long_res %>%
    left_join(long_less, by = c("ag_name", "sr_name")) %>%
    left_join(long_more, by = c("ag_name", "sr_name")) %>%
    mutate(measured = ifelse(is.na(residuals), ifelse(is.na(less_than), "more_than", "less_than"), "detectable"),
           residuals = ifelse(is.na(residuals), ifelse(is.na(less_than), more_than, less_than), residuals)) %>%
    select(!less_than:more_than) %>%
    mutate(sr_group = serum_group)
  
  return(comb)
}

# get root mean square error per variant
rmse_per_variant <- function(lndscp_fits, sr_group_fields = 2){
  all_residuals <- lapply(lndscp_fits, function(x) combine_residuals(x, sr_group_fields))
  
  all_residuals <- do.call(rbind, all_residuals)
  
  all_residuals %>%
    filter(!is.na(residuals)) %>%
    group_by(sr_group) %>%
    summarize(ag_name = "Total", 
              rmse = sqrt(sum(residuals^2, na.rm = TRUE)/(length(residuals))), # Residual standard error
              residual_type = "All variants") -> total_ag
  
  all_residuals %>%
    filter(!is.na(residuals)) %>%
    group_by(ag_name, sr_group) %>%
    mutate(rmse = sqrt(sum(residuals^2, na.rm = TRUE)/(length(residuals))),
           residual_type = "By variant") %>%
    plyr::rbind.fill(., total_ag) -> ssr
  
  return(ssr)
}

residuals <- rmse_per_variant(lndscp_fits, sr_group_fields = 2) %>%
  mutate(sr_group = factor(sr_group, levels = levels(srGroups(map))))

# plot residuals
residuals %>%
  ggplot(aes(x = ag_name, y = rmse, fill = ag_name)) + 
  geom_point(color = "grey20", shape = 21) + 
  scale_x_discrete(limits = agNames(map),
                   name = "Variant") +
  ylab("RMSE (log2(FC))") + 
  ylim(c(0,max(ceiling(residuals$rmse)))) +
  facet_wrap(~sr_group) + 
  scale_fill_manual(values = setNames(mapColors$Color, rownames(mapColors)),
                    name = "Variant") +
  theme_bw() +
  theme(strip.background = element_blank(),
        legend.position = "none",
        axis.text.x = element_text(angle = 45, hjust = 1)) -> rmse_plot_variant

# plot residuals
residuals %>%
  filter(ag_name == "Total") %>%
  ggplot(aes(x = sr_group, y = rmse, fill = sr_group)) + 
  geom_point(color = "grey20", shape = 21) + 
  ylab("RMSE (log2(FC))") + 
  ylim(c(0,max(ceiling(residuals$rmse)))) +
  scale_fill_manual(values = setNames(mapColors$Color, rownames(mapColors)),
                    name = "Serum group") +
  theme_bw() +
  xlab("Serum group") +
  theme(strip.background = element_blank(),
        legend.position = "none",
        axis.text.x = element_text(angle = 45, hjust = 1)) -> rmse_sr_groups


rmse_sr_groups + rmse_plot_variant + plot_layout(widths = c(1, 1.75))

Additional: Dealing with reactivity biases

Sometimes virus variants can exhibit high reactivity in an assay, meaning that measured titers against this variant are high due to assay, but not antigenic effects. It can also be the other way around, that a variant is low reactivity and titers against this variant are low in all samples. Judging whether a variant exhibits a reactivity bias is not always straight forward and requires external information, such as sequence substitutions.

The data set above contains the Gamma (P.1.1) variant, which was excluded in the demonstration above, because it exhibits a reactivity bias. Two factors in the current data set point towards this: Firstly, Gamma titers are higher than homologous titers in many sera, also in sera for which the homologous variant has distinct amino acids at antigenically important sites. Secondly, titers against the Beta (B.1.351) variant, which differs in spike from Gamma at position 417, are much lower in the same sera:

# read in data for multiple exposure landscapes
multi_exposure_data <- multi_exposure_data_b

multi_exposure_data %>%
  # log2 transform the data and set <LOD values to LOD/2
  mutate(logtiter = ifelse(titer == "<16", log2(16/20), log2(as.numeric(titer)/10)),
         sr_group = factor(sr_group, levels = sr_group_colors$SerumGroup)) -> data_long

# calculate
data_long %>%
  group_by(
    sr_group,
    ag_name
  ) %>%
  # if you want to use quick LOD/2 method for GMT calculation
  summarize(logtiter = mean(logtiter, na.rm = TRUE)) %>%
  mutate(logtiter = ifelse(logtiter < log2(0.8), log2(0.8), logtiter),
         sr_name = "GMT",
         gmt = logtiter)-> gmt_data

# plot the data
do_titerplot(data_long, target_sr_groups = unique(data_long$sr_group),
              gmt_data = gmt_data, 
             sr_group_colors = sr_group_colors,
             ag_order = agNames(og_map)) + 
  annotate("rect",
           xmin = "P.1.1",
           xmax = "B.1.351",
           ymin = -Inf,
           ymax = log2(0.8),
           fill = "red")

The Racmacs package has a function to deal with antigen reactivity biases. Reactivity adjustment are passed as a named vector on the log2 scale. So a reactivity adjustment of -1 means that all titers for this variant are multiplied with a factor of 0.5 (2^-1).

map <- og_map
# reduce P.1.1 reactivity by 1 dilution step. Alternatively, you can optimize an antigens reactivity using the function below. 0 = no adjustment, NA = optimization, value for numeric adjustment on the log2 scale
ag_reactivities <- rep(0, length(agNames(map)))
ag_reactivities[agNames(map) == "P.1.1"] <- -1
map <- optimizeAgReactivity(map, fixed_ag_reactivities = ag_reactivities, reoptimize = TRUE)

The adjusted titer data can be accessed using Racmacs’ adjustedTiterTable() and adjustedLogTiterTable() functions.